mla by feifei14119 · Pull Request #1280 · ROCm/ATOM

feifei14119 · 2026-06-18T08:25:11Z

Motivation

Technical Details

Test Plan

Test Result

Submission Checklist

Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

Copilot

Pull request overview

This PR updates ATOM’s MLA attention stack to support/use segmented MLA KV-cache kernels and a configurable MLA page size, and propagates the new fused “_seg” kernel entrypoints into vLLM/SGLang plugin integrations.

Changes:

Add `ATOM_MLA_PAGE_SIZE` env var and use it to configure MLA metadata builder block/page sizing.
Switch multiple call sites to segmented fused MLA cache-update kernels (`*_mla_seg`) and add segmented-layout handling/validation in `MLAAttention`.
Adjust MLA decode/prefill paths to pass/use the actual KV cache page size (instead of implicitly assuming 1).

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.


| File | Description |
| --- | --- |
| `atom/utils/envs.py` | Adds `ATOM_MLA_PAGE_SIZE` env var for configuring MLA page/block sizing. |
| `atom/model_ops/attentions/aiter_mla.py` | Uses `ATOM_MLA_PAGE_SIZE` to set the metadata builder’s `block_size`. |
| `atom/model_ops/attention_mla.py` | Implements segmented KV-cache layout support, adds validation, adjusts page-size handling, and updates kernel call paths. |
| `atom/plugin/vllm/attention/layer_mla.py` | Updates vLLM plugin calls to use segmented fused MLA cache-update kernel. |
| `atom/plugin/sglang/models/deepseek_mla_attention.py` | Updates SGLang plugin call to use segmented fused MLA cache-update kernel wrapper. |


        os.getenv("ATOM_USE_TRITON_MLA_SHUFFLE_KV", "0") == "1"
    ),
    "ATOM_USE_TRITON_MOE": lambda: os.getenv("ATOM_USE_TRITON_MOE", "0") == "1",
+    "ATOM_MLA_PAGE_SIZE": lambda: int(os.getenv("ATOM_MLA_PAGE_SIZE", "1")),
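
The added entry parses the variable with a bare `int(...)`, so a non-numeric value raises a generic `ValueError` when the lambda is evaluated, and zero or a negative value is accepted as-is. A minimal sketch of a stricter parser follows; `read_mla_page_size` is a hypothetical helper for illustration and is not part of this PR or of ATOM:

```python
import os


def read_mla_page_size(default: int = 1) -> int:
    """Hypothetical helper (not in the PR): parse ATOM_MLA_PAGE_SIZE strictly."""
    raw = os.getenv("ATOM_MLA_PAGE_SIZE", str(default))
    try:
        value = int(raw)
    except ValueError as exc:
        # Name the variable in the error so a bad setting is easy to trace.
        raise ValueError(
            f"ATOM_MLA_PAGE_SIZE must be an integer, got {raw!r}"
        ) from exc
    if value < 1:
        # A page must hold at least one token.
        raise ValueError(f"ATOM_MLA_PAGE_SIZE must be >= 1, got {value}")
    return value
```

Whether the PR should reject invalid values at this point or leave that to the metadata builder is a design choice for the authors; the sketch only shows what early validation could look like.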


+        # DEBUG(seg): zero-init instead of empty so any region the decode asm
+        # does not write shows up as 0 rather than garbage (isolates
+        # uninitialized-read bugs in the seg pass).
+        o = torch.zeros(
            B,


+        # DEBUG(seg): zero-init instead of empty so any region the decode asm
+        # does not write shows up as 0 rather than garbage (isolates
+        # uninitialized-read bugs in the seg pass).
+        o = torch.zeros(
            B,


+            # ids at block granularity, so PAGE_SIZE must be the real KV cache
+            # block size for the kernel's page// and intra-page% addressing.
+            page_size = get_current_atom_config().kv_cache_block_size
+            logger.info("triton_mla decode: page_size=%d", page_size)
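
The comment above refers to the usual paged-cache addressing: a token's logical position is split into a page index (`pos // page_size`) and an in-page offset (`pos % page_size`), and the page index is then looked up in a per-sequence block table. A minimal sketch under that assumption follows; `locate_token`, `flat_slot`, and `block_table` are illustrative names, not identifiers from ATOM or the Triton kernel:

```python
from __future__ import annotations


def locate_token(pos: int, block_table: list[int], page_size: int) -> tuple[int, int]:
    """Map a logical token position to (physical block id, offset inside the block)."""
    page_idx, offset = divmod(pos, page_size)  # page// and intra-page%
    return block_table[page_idx], offset


def flat_slot(pos: int, block_table: list[int], page_size: int) -> int:
    """Flat slot index in a cache laid out as [num_blocks, page_size, ...]."""
    block, offset = locate_token(pos, block_table, page_size)
    return block * page_size + offset


# Illustrative values: a sequence whose pages live in physical blocks 7, 2, 5.
block_table = [7, 2, 5]
print(locate_token(18, block_table, page_size=16))  # token 18 -> second page, offset 2
print(flat_slot(18, block_table, page_size=16))
```

If the block table holds block-granular ids but the kernel is told `page_size=1`, the same position would be used directly as an index into the table, which is the mismatch the comment warns about.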


+                q_out = torch.zeros(
+                    (
+                        q_nope.shape[0],
+                        self.num_heads,
+                        _MLA_Q_OUT_PADDED_DIM,
+                    ),
+                    dtype=attn_metadata.dtype_q,
+                    device=q_nope.device,
+                )


Copilot

Copilot was unable to review this pull request because the user who requested the review is ineligible. To be eligible to request a review, you need a paid Copilot license, or your organization must enable Copilot code review.


Copilot AI review requested due to automatic review settings June 18, 2026 08:25

Copilot started reviewing on behalf of feifei14119 June 18, 2026 08:26

Copilot AI reviewed Jun 18, 2026


HaonanWang98 force-pushed the feiw/pr/mla2 branch from 76d4fb7 to ff20bd8 June 18, 2026 10:10

Copilot AI review requested due to automatic review settings June 18, 2026 12:29

Copilot AI reviewed Jun 18, 2026

mla

4c4a325

HaonanWang98 force-pushed the feiw/pr/mla2 branch from 98b0cf0 to 4c4a325 June 18, 2026 14:24

revert some non-aiter related changes

c7bf6af

Copilot AI review requested due to automatic review settings June 19, 2026 06:15

Copilot AI reviewed Jun 19, 2026

HaonanWang98 added 2 commits June 19, 2026 09:42

fix format

749aaf4

fix block-size usage

703d17d

Copilot AI review requested due to automatic review settings June 19, 2026 12:59

Copilot AI reviewed Jun 19, 2026

HaonanWang98 added 2 commits June 19, 2026 13:08

fix block size and persistent mode

05980c3

fix block size to env var

9d7c49b

Copilot AI review requested due to automatic review settings June 19, 2026 13:32

Copilot AI reviewed Jun 19, 2026

feifei14119 wants to merge 6 commits into main from feiw/pr/mla2

3 participants
